@naoyam naoyam commented Dec 19, 2025

This PR introduces the combine operation as discussed in the RaggedIterDomain design doc.

One design decision that I changed from the original design doc concerns how component iter domains are detected and validated. Previously, I was planning to use the exact graph to find the component iter domain corresponding to a given ragged iter domain (e.g., #5550 (comment)). However, that does not work when a fusion is segmented and a segment lacks the Partition expr for a RaggedIterDomain. For example, suppose a tensor is used as an input to asNested and is followed by other operations. If the fusion is segmented after some of those operations, the latter segment cannot see the asNested and Partition operations because they don't exist in that segment. This could be alleviated by providing an exact graph for the complete fusion, but more fundamentally, if a fusion has a nested tensor as an input, there doesn't seem to be any reasonable way to attach a Partition expr.

See doc/dev/ragged_iter_domain_combine_design_doc.md for detailed discussions. For now, I decided not to worry too much about the validation and to assume that correctness is guaranteed by the user.

Note that partitioning is still limited to 1D extents. Multi-dim offsets will be the next step of this series of PRs.
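
To make the flattening concrete, here is a minimal sketch under the 1D-extents restriction mentioned above. It does not use any nvFuser types: `computeOffsets`, `combinedExtent`, and `flatIndex` are hypothetical helpers invented for illustration, and the index mapping is an assumption about how a combined domain would typically be addressed, not something taken from this PR. It only shows the arithmetic: the combined extent is the sum of the per-component extents, and a flat position is the component's offset plus the position within that component.

```cpp
#include <cassert>
#include <cstdint>
#include <numeric>
#include <vector>

// Hypothetical helper (not nvFuser API): exclusive prefix sum of the
// per-component extents. offsets[i] is where component i starts in the
// flattened domain; offsets.back() is the total size.
std::vector<int64_t> computeOffsets(const std::vector<int64_t>& extents) {
  std::vector<int64_t> offsets(extents.size() + 1, 0);
  std::partial_sum(extents.begin(), extents.end(), offsets.begin() + 1);
  return offsets;
}

// Extent of the combined (flattened) domain: the sum of the ragged extents.
int64_t combinedExtent(const std::vector<int64_t>& extents) {
  return std::accumulate(extents.begin(), extents.end(), int64_t{0});
}

// Assumed mapping from (component index, position inside that component)
// to a position in the combined domain.
int64_t flatIndex(
    const std::vector<int64_t>& extents,
    int64_t component,
    int64_t pos) {
  assert(component >= 0 && component < static_cast<int64_t>(extents.size()));
  assert(pos >= 0 && pos < extents[component]);
  return computeOffsets(extents)[component] + pos;
}
```

For instance, with extents `{2, 3, 1}` the offsets are `{0, 2, 5, 6}`, the combined extent is 6, and element 2 of component 1 lands at flat position 4.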

naoyam commented Dec 19, 2025

!test

github-actions bot commented Dec 19, 2025

Review updated until commit be0e2ea

Description

  • Implement RaggedIterDomain::combine() method to flatten ragged structures back to regular IterDomains

  • Add Combine expression class to represent the inverse of Partition operation

  • Include comprehensive validation for component-ragged pairing when Partition definition exists

  • Provide extensive test coverage including basic combine, validation cases, and asNested integration

  • Add detailed design document explaining validation strategy and architectural decisions

Changes walkthrough

Relevant files

Enhancement

internal_base_nodes.cpp: Implement combine method for RaggedIterDomain
csrc/ir/internal_base_nodes.cpp (+100/-0)

  • Implement RaggedIterDomain::combine() method with input validation and
    Combine expression creation
  • Add checks for null inputs, type compatibility, and parallel type
    constraints
  • Validate component-ragged pairing when Partition definition is
    available
  • Create symbolic combined extent as sum of ragged extents

internal_nodes.cpp: Add Combine expression class
csrc/ir/internal_nodes.cpp (+27/-0)

  • Add Combine class definition with component and ragged inputs
  • Implement toString() and toInlineString() methods for Combine
    expression
  • Add clone and create functionality for Combine operation

dispatch.h: Register Combine in dispatch system
csrc/dispatch.h (+1/-0)

  • Add Combine to the dispatch macro list for IR expression handling

internal_base_nodes.h: Declare combine method in header
csrc/ir/internal_base_nodes.h (+16/-0)

  • Add RaggedIterDomain::combine() method declaration with detailed
    documentation
  • Include parameter descriptions and usage examples

internal_nodes.h: Declare Combine class in header
csrc/ir/internal_nodes.h (+38/-0)

  • Add Combine class declaration with input/output accessors
  • Include component(), ragged(), and out() accessor methods

Tests

test_ragged_iter_domain.cpp: Add comprehensive combine operation tests
tests/cpp/test_ragged_iter_domain.cpp (+195/-0)

  • Add TEST_F(RaggedIterDomainTest, CombineBasic) for basic combine
    functionality
  • Add validation tests for null inputs and type mismatches
  • Add TEST_F(RaggedIterDomainTest, AsNestedThenCombine) for integration
    testing
  • Add TEST_F(RaggedIterDomainTest, AsNestedThenSetThenCombine) for
    post-propagation testing

Documentation

ragged_iter_domain_combine_design_doc.md: Add detailed design documentation
doc/dev/ragged_iter_domain_combine_design_doc.md (+355/-0)

  • Add comprehensive design document explaining combine operation
    rationale
  • Compare validation alternatives (stored pointer, IR traversal,
    partition-only, TensorDomain pairing)
  • Document chosen approach: validate when Partition exists, trust user
    otherwise
  • Include implementation notes and future considerations

PR Reviewer Guide

Here are some key observations to aid the review process:

🧪 PR contains tests
⚡ Recommended focus areas for review

Symbolic Extent Handling

The combined extent is created as a symbolic Val without concrete value computation. While this may be appropriate for symbolic execution, consider if there are scenarios where the actual sum of extents is needed for analysis or optimization passes.

    Val* combined_extent =
        IrBuilder::createInContainer<Val>(container, DataType::Index);

Validation Limitations

The validation only checks component-ragged pairing when a direct Partition definition exists. After operations like set() or in segmented fusions, validation is skipped and user correctness is assumed. This could lead to runtime errors if incorrect components are provided.

    if (ragged->definition() != nullptr &&
        ragged->definition()->isA<Partition>()) {
      auto* partition = ragged->definition()->as<Partition>();
      IterDomain* expected_component = partition->component();
    
      NVF_ERROR(
          component == expected_component,
          "combine: component mismatch. The provided component does not match ",
          "the component from the Partition that created this "
          "RaggedIterDomain.\n",
          "  Provided component: ",
          component->toString(),
          "\n",
          "  Expected component: ",
          expected_component->toString());
    }
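
The "validate when the Partition is visible, otherwise trust the user" policy can be illustrated in isolation. The following is a minimal sketch with toy stand-in types (`ToyIterDomain`, `ToyPartition`, `ToyRaggedIterDomain`, `checkComponent` are all hypothetical names, not nvFuser classes), and it throws `std::invalid_argument` where the real code uses `NVF_ERROR`. It is meant only to show the two outcomes of the check, not the actual implementation.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Toy stand-ins for illustration only; these are not the nvFuser classes.
struct ToyIterDomain {
  std::string name;
};

struct ToyPartition {
  const ToyIterDomain* component;
};

struct ToyRaggedIterDomain {
  std::string name;
  // Null when the defining Partition is not visible, e.g. after a set()
  // or inside a segment that does not contain the Partition.
  const ToyPartition* definition = nullptr;
};

enum class CheckResult { Validated, Trusted };

// Best-effort check: compare against the Partition's component when a
// Partition definition exists; otherwise accept the caller's pairing.
CheckResult checkComponent(
    const ToyIterDomain* component,
    const ToyRaggedIterDomain* ragged) {
  if (component == nullptr || ragged == nullptr) {
    throw std::invalid_argument("combine: null input");
  }
  if (ragged->definition != nullptr) {
    if (ragged->definition->component != component) {
      throw std::invalid_argument("combine: component mismatch");
    }
    return CheckResult::Validated;
  }
  return CheckResult::Trusted;
}
```

In this sketch, a mismatched component is rejected only in the `Validated` path; in the `Trusted` path the same mistake passes silently, which is the limitation described above.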

1D Extent Assumption

The code assumes the extents tensor is 1D and validates this assumption. This aligns with the current limitation mentioned in the PR description, but consider if multi-dimensional extents support is planned and how this validation should be updated.

    NVF_ERROR_EQ(
        std::ranges::distance(extents_tv->getLogicalDomain() | TensorDomain::kNoReductions),
        1,
        "Unexpected rank of extent tensor: ",
        extents_tv->toString());
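
The rank check above appears to count only non-reduction axes of the logical domain. A minimal sketch of that idea with toy types follows; `ToyAxis`, `nonReductionRank`, and `isValidExtentsRank` are hypothetical names for illustration, and `std::count_if` is used here in place of the ranges pipeline in the snippet.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-in for one axis of a logical domain; not an nvFuser type.
struct ToyAxis {
  int64_t extent;
  bool is_reduction;
};

// Number of axes that remain after dropping reduction axes.
int64_t nonReductionRank(const std::vector<ToyAxis>& logical_domain) {
  return static_cast<int64_t>(std::count_if(
      logical_domain.begin(),
      logical_domain.end(),
      [](const ToyAxis& axis) { return !axis.is_reduction; }));
}

// The extents tensor is accepted only if exactly one such axis remains.
bool isValidExtentsRank(const std::vector<ToyAxis>& logical_domain) {
  return nonReductionRank(logical_domain) == 1;
}
```

Under this sketch, a tensor with one iteration axis plus a reduction axis still counts as 1D, while two iteration axes are rejected; supporting multi-dimensional extents would mean relaxing the `== 1` condition.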

greptile-apps bot commented Dec 19, 2025

Greptile Summary

This PR introduces the Combine operation for RaggedIterDomain, which is the inverse of Partition. The operation combines a component IterDomain and a RaggedIterDomain back into a single regular IterDomain.

Key changes:

  • Adds the RaggedIterDomain::combine() static method, which validates inputs and creates a combined IterDomain with a symbolic extent
  • Implements a new Combine expression class that mirrors the Partition structure (inputs: component + ragged, output: combined)
  • Uses the Option 3 validation approach: validates component-ragged pairing only when a direct Partition definition exists, otherwise trusts the user
  • Unlike Partition, which stores extents as an attribute, Combine accesses extents via ragged->extents() without storing them
  • Creates a symbolic extent Val for the combined dimension rather than computing the actual sum
  • Includes comprehensive test coverage and detailed design documentation

The implementation follows nvFuser's design philosophy of trusting user-provided inputs (similar to arithmetic operations), with best-effort validation when feasible.

Confidence Score: 4/5

  • This PR is safe to merge with low risk; it is a well-designed foundation for ragged domain operations.
  • The score reflects a solid implementation with thorough validation and testing, although the symbolic extent approach defers some complexity to the lowering phase. The Option 3 design choice is pragmatic and well documented. No critical bugs were found. The score is 4/5 for two reasons: (1) the symbolic extent leaves its relationship to the actual sum implicit, which may complicate future lowering and indexing, and (2) unlike Partition, Combine does not store extents as an attribute, which could cause issues if the ragged domain loses its connection to extents through transformations.
  • csrc/ir/internal_base_nodes.cpp: the symbolic extent approach should be monitored during lowering implementation to ensure proper index computation.

Important Files Changed

| Filename | Overview |
| --- | --- |
| csrc/ir/internal_base_nodes.cpp | Implements RaggedIterDomain::combine() with validation, symbolic extent creation, uses Option 3 design approach |
| csrc/ir/internal_nodes.h | Added Combine expression class definition with accessor methods, mirrors Partition structure |
| csrc/ir/internal_nodes.cpp | Implements Combine constructor and string methods, follows existing Partition pattern |
| tests/cpp/test_ragged_iter_domain.cpp | Added comprehensive test coverage for combine() including basic usage, validation, and propagation scenarios |

Sequence Diagram

    sequenceDiagram
        participant User
        participant RaggedIterDomain
        participant Combine as Combine Expr
        participant IterDomain
        participant Partition as Partition Expr
    
        User->>RaggedIterDomain: combine(component, ragged)
        
        RaggedIterDomain->>RaggedIterDomain: Validate component != null
        RaggedIterDomain->>RaggedIterDomain: Validate ragged != null
        RaggedIterDomain->>RaggedIterDomain: Validate component is not RaggedIterDomain
        RaggedIterDomain->>RaggedIterDomain: Validate parallel types are Serial
        RaggedIterDomain->>RaggedIterDomain: Validate iter types are Iteration
        
        alt ragged has Partition definition
            RaggedIterDomain->>Partition: Get expected component
            Partition-->>RaggedIterDomain: Return component IterDomain
            RaggedIterDomain->>RaggedIterDomain: Validate component matches expected
        else No Partition definition
            Note over RaggedIterDomain: Trust user (Option 3)
        end
        
        RaggedIterDomain->>RaggedIterDomain: Get extents from ragged
        RaggedIterDomain->>RaggedIterDomain: Validate extents is 1D
        RaggedIterDomain->>RaggedIterDomain: Create symbolic extent Val
        
        RaggedIterDomain->>IterDomain: Create combined IterDomain
        IterDomain-->>RaggedIterDomain: Return combined_id
        
        RaggedIterDomain->>Combine: Create Combine expression
        Combine->>Combine: addOutput(combined_id)
        Combine->>Combine: addInput(component)
        Combine->>Combine: addInput(ragged)
        
        RaggedIterDomain-->>User: Return combined IterDomain
    
    @greptile-apps greptile-apps bot left a comment

    5 files reviewed, 2 comments

    @greptile-apps greptile-apps bot left a comment

    6 files reviewed, 1 comment

    @greptile-apps greptile-apps bot left a comment

    7 files reviewed, 1 comment

    @naoyam naoyam changed the title from "[WIP] Combine (merge) for RaggedIterDomain" to "Combine for RaggedIterDomain" Dec 19, 2025

    @greptile-apps greptile-apps bot left a comment

    No files reviewed, no comments

    naoyam commented Jan 13, 2026

    !test
